Errors-in-variables models with dependent measurements

Authors

  • Mark Rudelson
  • Shuheng Zhou
Abstract

Suppose that we observe $y \in \mathbb{R}^n$ and $X \in \mathbb{R}^{n \times m}$ in the following errors-in-variables model: $y = X_0 \beta^* + \epsilon$, $X = X_0 + W$, where $X_0$ is an $n \times m$ design matrix with independent subgaussian row vectors, $\epsilon \in \mathbb{R}^n$ is a noise vector, and $W$ is a mean-zero $n \times m$ random noise matrix with independent subgaussian column vectors, independent of $X_0$ and $\epsilon$. This model differs significantly from those analyzed in the literature in that we allow the measurement error for each covariate to be a dependent vector across its $n$ observations. Such error structures appear in the science literature when modeling the trial-to-trial fluctuations in response strength shared across a set of neurons. Under sparsity and restricted eigenvalue-type conditions, we show that one is able to recover a sparse vector $\beta^* \in \mathbb{R}^m$ from the model given a single observation matrix $X$ and the response vector $y$. We establish consistency in estimating $\beta^*$ and obtain the rates of convergence in the $\ell_q$ norm, where $q = 1, 2$ for the Lasso-type estimator and $q \in [1, 2]$ for a Dantzig-type conic programming estimator. We show error bounds that approach those of the regular Lasso and the Dantzig selector as the errors in $W$ tend to 0. We analyze the convergence rates of gradient descent methods for solving the nonconvex programs and show that the composite gradient descent algorithm is guaranteed to converge at a geometric rate to a neighborhood of the global minimizers; the size of the neighborhood is bounded by the statistical error in the $\ell_2$ norm. Our analysis reveals interesting connections between computational and statistical efficiency and the concentration of measure phenomenon in random matrix theory. We provide simulation evidence illuminating the theoretical predictions.

MSC 2010 subject classifications: Primary 60K35.
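As a rough illustration of the data-generating model described in the abstract, the following sketch draws one sample (y, X) with measurement errors that are dependent across the n observations of each covariate. It is not the authors' simulation setup: the function name `simulate_eiv`, the Gaussian distributions, the AR(1) covariance across observations, and all parameter values (`rho`, `tau`, `sigma`, the support of the sparse vector) are assumptions chosen only to make the example concrete.

```python
import numpy as np

def simulate_eiv(n=200, m=50, s=5, rho=0.5, tau=0.3, sigma=0.1, seed=0):
    """Draw one sample from y = X0 @ beta_star + eps, X = X0 + W.

    Hypothetical illustration: Gaussian entries and an AR(1) covariance
    are assumptions, not choices taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Sparse target vector with s nonzero entries (assumed support: first s).
    beta_star = np.zeros(m)
    beta_star[:s] = 1.0

    # Clean design X0 with independent (Gaussian, hence subgaussian) rows.
    X0 = rng.standard_normal((n, m))

    # Assumed covariance across the n observations: B[i, j] = tau^2 * rho^|i-j|.
    idx = np.arange(n)
    B = tau**2 * rho ** np.abs(idx[:, None] - idx[None, :])

    # Each column of W is an independent N(0, B) vector, so the error for a
    # given covariate is dependent across its n observations.
    chol = np.linalg.cholesky(B)
    W = chol @ rng.standard_normal((n, m))

    # Response noise, independent of X0 and W.
    eps = sigma * rng.standard_normal(n)

    y = X0 @ beta_star + eps   # response uses the unobserved clean design
    X = X0 + W                 # only the contaminated design is observed
    return y, X, X0, W, beta_star

y, X, X0, W, beta_star = simulate_eiv()
print(y.shape, X.shape)
```

Only `y` and `X` would be available to an estimator; `X0`, `W`, and `beta_star` are returned here so that recovery error can be checked in an experiment.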

Similar resources

Solving the estimating equations of regression models with random measurement error in the independent variable by an optimization method

Measurements of some variables in statistical analysis are often subject to random errors, so investigating the effects of these errors is important. This is all the more necessary in regression analysis, because the aim of fitting a regression model is to estimate the effect of an independent variable on a response variable. Then measurements of an independe...

Evaluation of different methods for modeling the relationships between hyetograph and unit hydrograph components

Extensive data collection on precipitation and runoff is required for the development and implementation of soil and water projects. The unit hydrograph (UH) is an appropriate basis for deriving flood hydrographs and therefore provides comprehensive information for planners and managers. However, UH derivation is not an easy job for whole watersheds. The development of UH by using easily accessible ...

Evaluation of the Most Important Factors Affecting the Income of Taxes in the Economy of Iran with the Approach of TVP DMA Models

Due to the government’s roles and responsibilities in economy for financing public services through taxes as sustainable revenue, this study investigates not only the effective factors on tax revenues and its theoretical bases but also selects the related important effective variables in Iran’s economy during the period (1971-2017) by using dynamic models TVP - DMA. The classic models focus on ...

Integrated production-Inventory model with price-dependent demand, imperfect quality, and investment in quality and inspection

In practice, manufacturing systems are never perfect and may have low quality outputs. Therefore, different decisions such as reprocessing, sale at lower prices or diminishing are made according to industry and market. This paper investigates the importance of supply chain coordination through developing two models in centralized decision-making for an imperfect quality manufacturing system wit...

Using latent variables in the logistic regression model to eliminate the effect of multicollinearity in the analysis of some factors associated with breast cancer

Background and Objectives: Logistic regression is one of the most widely used generalized linear models for analyzing the relationships between one or more explanatory variables and a categorical response. Strong correlations among explanatory variables (multicollinearity) reduce the efficiency of the model to a considerable degree. In this study we used latent variables to reduce the effects of ...

Error Recovery by the Use of Sensory Feedback and Reference Measurements for Robotic Assembly

Industrial robots need instrument or parts transport to do which requires coordinate to show the robot’s instrument, parts and body. When investigating the robot location, we are usually interested in measuring its location relative to a reference coordinate system. In this system it is attempted to make the assemble direction smaller by designing the sensor board and making use of an instrumen...

Publication date: 2017